Using data-display networks for exploratory data analysis in phylogenetic studies.

Author

  • David A Morrison
Abstract

Exploratory data analysis (EDA) is a frequently undervalued part of data analysis in biology. It involves evaluating the characteristics of the data "before" proceeding to the definitive analysis in relation to the scientific question at hand. For phylogenetic analyses, a useful tool for EDA is a data-display network. This type of network is designed to display any character (or tree) conflict in a data set, without prior assumptions about the causes of those conflicts. The conflicts might be caused by 1) methodological issues in data collection or analysis, 2) homoplasy, or 3) horizontal gene flow of some sort. Here, I explore 13 published data sets using splits networks, as examples of using data-display networks for EDA. In each case, I performed an original EDA on the data provided, to highlight the aspects of the resulting network that will be important for an interpretation of the phylogeny. In each case, there is at least one important point (possibly missed by the original authors) that might affect the phylogenetic analysis. I conclude that EDA should play a greater role in phylogenetic analyses than it has done.
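As a rough illustration of what "character conflict" means in this context, the sketch below checks pairs of binary characters for incompatibility: two characters conflict when the splits (bipartitions of the taxa) they induce cannot both sit on the same tree. This is not the procedure used in the paper; the function names and the toy matrix are hypothetical, and real data sets would normally be explored with dedicated network software.

```python
from itertools import combinations


def split_of(column):
    """Return the split (bipartition of taxon indices) induced by one binary character."""
    side_a = frozenset(i for i, state in enumerate(column) if state == column[0])
    side_b = frozenset(range(len(column))) - side_a
    return side_a, side_b


def compatible(split1, split2):
    """Two splits are compatible if at least one of the four pairwise intersections is empty."""
    return any(not (x & y) for x in split1 for y in split2)


def conflicting_pairs(matrix):
    """Return index pairs of characters whose splits are mutually incompatible.

    matrix: one string of '0'/'1' states per taxon, all of equal length.
    """
    columns = list(zip(*matrix))
    splits = [split_of(col) for col in columns]
    return [
        (i, j)
        for i, j in combinations(range(len(splits)), 2)
        if not compatible(splits[i], splits[j])
    ]


# Hypothetical toy data: four taxa scored for three binary characters.
toy_matrix = [
    "110",  # taxon A
    "100",  # taxon B
    "010",  # taxon C
    "001",  # taxon D
]

print(conflicting_pairs(toy_matrix))
```

In this toy matrix, characters 0 and 1 group the taxa as AB|CD and AC|BD respectively, so no single tree can display both. In a splits network, a conflict of this kind appears as a box of parallel edges rather than a simple tree-like branch.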

Similar resources

Phylogenetic networks: a new form of multivariate data summary for data mining and exploratory data analysis

Exploratory data analysis (EDA), involving both graphical displays and numerical summaries of data, is intended to evaluate the characteristics of the data as well as to provide a form of data mining. For multivariate data, the best-known visual summaries include discriminant analysis, ordination, and clustering, particularly metric ordinations such as principal components analysis. However, these ...

Region Directed Diffusion in Sensor Network Using Learning Automata:RDDLA

One of the main challenges in wireless sensor networks is the energy problem and the life cycle of nodes in the network. Several methods can be used to increase the life cycle of nodes. One of these methods is load balancing among nodes while transmitting data from source to destination. The directed diffusion algorithm, a data-oriented algorithm, is one of the methods proposed for wireless sensor networks. Direct...

Phylogeny of some species of Astragalus (Fabaceae) based on morphological data

The phylogenetic relationships among 39 species belonging to 12 sections of Astragalus from Iran were studied on the basis of 29 morphological characters. The cladistic analysis of the morphological data was performed using the PAUP* 4.0b10 program. The results were compared with the molecular systematic data obtained from nuclear DNA ITS. In contrast with previous molecular systematic stud...

Design and Test of the Real-time Text mining dashboard for Twitter

One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are currently being produced at high speed, in large volumes, and in a wide variety of formats. Data with such features are called big data. Extracting, processing, and visualizing this huge amount of data has become one of the concerns of data science scholar...

Journal title:
  • Molecular biology and evolution

Volume 27, Issue 5

Pages: -

Publication date: 2010